
    COllective INtelligence with task assignment

    In this paper we study the COllective INtelligence (COIN) framework of Wolpert et al. for dispersion games (Grenager, Powers and Shoham, 2002) and variants of the El Farol Bar problem. These settings constitute difficult MAS problems where fine-grained coordination between the agents is required. We enhance the COIN framework to dramatically improve convergence results for MAS with a large number of agents. The improved convergence properties for the dispersion games are competitive with strategies specially tailored to solving dispersion games. The enhancements to the COIN framework proved essential for solving the more complex variants of the El Farol Bar-like problem.
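The credit-assignment idea behind COIN can be sketched in a few lines. This is a hedged illustration, not the paper's implementation: the game, the difference-utility form (Wolpert's Wonderful Life Utility), and all parameters are assumptions, with the global utility taken as the number of distinct tasks covered.

```python
import random

# Toy dispersion game: n agents each pick one of n tasks.
# Global utility G = number of distinct tasks covered.
def G(actions):
    return len(set(actions))

# Wonderful Life Utility: G with agent i's action included minus G with
# it removed, crediting an agent only for the task it uniquely covers.
def wlu(actions, i):
    return G(actions) - G(actions[:i] + actions[i + 1:])

random.seed(0)
n = 10
values = [[0.0] * n for _ in range(n)]  # per-agent action-value estimates
eps, alpha = 0.1, 0.1
for step in range(2000):
    actions = [random.randrange(n) if random.random() < eps
               else max(range(n), key=lambda a: values[i][a])
               for i in range(n)]
    for i, a in enumerate(actions):
        # each agent learns from its WLU, not from the raw global reward
        values[i][a] += alpha * (wlu(actions, i) - values[i][a])

greedy = [max(range(n), key=lambda a: values[i][a]) for i in range(n)]
print(G(greedy))
```

Because a shared task contributes zero WLU while a uniquely covered task contributes one, the learners are pushed toward dispersion without any explicit coordination.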

    COllective INtelligence with sequences of actions

    The design of a Multi-Agent System (MAS) that performs well on a collective task is non-trivial. Straightforward application of learning in a MAS can lead to suboptimal solutions as agents compete or interfere. The COllective INtelligence (COIN) framework of Wolpert et al. proposes an engineering solution for MASs in which agents learn to focus on actions that support a common task. As a case study, we investigate the performance of COIN on representative token-retrieval problems found to be difficult for agents using classic Reinforcement Learning (RL). We further investigate several techniques from RL (model-based learning, Q(lambda)) to scale the application of the COIN framework. Lastly, the COIN framework is extended to improve performance on sequences of actions.
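One of the RL techniques the abstract mentions, Q(lambda), can be sketched as a tabular learner with eligibility traces. The environment here is an assumed toy five-state chain, not the paper's token-retrieval task, and the accumulating-trace variant and all constants are illustrative choices.

```python
import random

n_states, n_actions = 5, 2
# optimistic initialization encourages systematic early exploration
Q = [[1.0] * n_actions for _ in range(n_states)]
alpha, gamma, lam, eps = 0.1, 0.95, 0.8, 0.1

def step(s, a):
    # action 1 moves right, action 0 moves left; reward at the right end
    s2 = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
    return s2, (1.0 if s2 == n_states - 1 else 0.0)

random.seed(0)
for episode in range(200):
    s = 0
    E = [[0.0] * n_actions for _ in range(n_states)]  # eligibility traces
    for t in range(50):
        a = random.randrange(n_actions) if random.random() < eps \
            else max(range(n_actions), key=lambda x: Q[s][x])
        s2, r = step(s, a)
        done = s2 == n_states - 1
        delta = r + (0.0 if done else gamma * max(Q[s2])) - Q[s][a]
        E[s][a] += 1.0                        # accumulating trace
        for i in range(n_states):
            for j in range(n_actions):
                # traces spread one TD error over recently visited pairs
                Q[i][j] += alpha * delta * E[i][j]
                E[i][j] *= gamma * lam        # decay every trace
        s = s2
        if done:
            break

print(Q[0])  # start-state values after learning
```

The traces are what make the method attractive for sequences of actions: a single reward at the end of the chain updates every state-action pair along the path in one sweep, instead of propagating back one step per episode.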

    Foresighted policy gradient reinforcement learning: solving large-scale social dilemmas with rational altruistic punishment

    Many important and difficult problems can be modeled as “social dilemmas”, like Hardin's Tragedy of the Commons or the classic iterated Prisoner's Dilemma. It is well known that in these problems, it can be rational for self-interested agents to promote and sustain cooperation by altruistically dispensing costly punishment to other agents, thus maximizing their own long-term reward. However, self-interested agents using most current multi-agent reinforcement learning algorithms will not sustain cooperation in social dilemmas: the algorithms do not sufficiently capture the consequences for the agent's reward of its interactions with other agents. Recent, more foresighted algorithms specifically account for such expected consequences, and have been shown to work well for the small-scale Prisoner's Dilemma. However, this approach quickly becomes intractable for larger social dilemmas. Here, we advance on this work and develop a “teach/learn” stateless foresighted policy gradient reinforcement learning algorithm that applies to social dilemmas with negative, unilateral side-payments, in the form of costly punishment. In this setting, the algorithm allows agents to learn the most rewarding actions to take with respect to both the dilemma (Cooperate/Defect) and the “teaching” of other agents' behavior through the dispensing of punishment. Unlike other algorithms, we show that this approach scales well to large settings like the Tragedy of the Commons. We show for a variety of settings that large groups of self-interested agents using this algorithm will robustly find and sustain cooperation in social dilemmas where adaptive agents can punish the behavior of other similarly adaptive agents.
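The stateless policy-gradient machinery involved can be sketched as follows. This is an assumed, myopic REINFORCE baseline on a two-player Prisoner's Dilemma with costly punishment, not the paper's "teach/learn" algorithm: the foresight term is omitted, so the sketch illustrates the credit-assignment problem the abstract describes (a myopic learner sees only the immediate cost of punishing) rather than its solution. Payoffs and the learning rate are illustrative.

```python
import math, random

def sigmoid(x):
    # numerically stable logistic
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

R, S, T, P = 3.0, 0.0, 5.0, 1.0   # standard PD payoffs
COST, FINE = 1.0, 4.0             # punisher pays COST, target loses FINE
alpha = 0.05

# per agent: [logit of p(cooperate), logit of p(punish | other defected)]
theta = [[0.0, 0.0], [0.0, 0.0]]
random.seed(1)
for t in range(5000):
    coop = [random.random() < sigmoid(th[0]) for th in theta]
    pay = [0.0, 0.0]
    for i in (0, 1):
        j = 1 - i
        pay[i] = (R if coop[i] and coop[j] else
                  S if coop[i] else
                  T if coop[j] else P)
    pun = [False, False]
    for i in (0, 1):
        j = 1 - i
        if not coop[j] and random.random() < sigmoid(theta[i][1]):
            pun[i] = True
            pay[i] -= COST    # unilateral negative side-payment
            pay[j] -= FINE
    for i in (0, 1):
        j = 1 - i
        # REINFORCE on Bernoulli policies: grad log p = action - p
        theta[i][0] += alpha * pay[i] * ((1.0 if coop[i] else 0.0)
                                         - sigmoid(theta[i][0]))
        if not coop[j]:
            theta[i][1] += alpha * pay[i] * ((1.0 if pun[i] else 0.0)
                                             - sigmoid(theta[i][1]))

print([round(sigmoid(th[0]), 2) for th in theta])
```

Because punishing only ever lowers the punisher's immediate payoff, this myopic gradient discourages punishment; the paper's contribution is precisely to add the foresighted "teaching" value of punishment so that cooperation can be sustained.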

    Learning from induced changes in opponent (re)actions in multi-agent games

    Multi-agent learning is a growing area of research. An important topic is to formulate how an agent can learn a good policy in the face of adaptive, competitive opponents. Most research has focused on extensions of single-agent learning techniques originally designed for agents in more static environments. These techniques, however, fail to incorporate a notion of the effect of an agent's own previous actions on the development of the policies of the other agents in the system. We argue that incorporation of this property is beneficial in competitive settings. In this paper, we present a novel algorithm to capture this notion, and present experimental results to validate our claim.

    A decommitment strategy in a competitive multi-agent transportation setting

    Decommitment is the action of foregoing a contract for another (superior) offer. It has been shown that, using decommitment, agents can reach higher utility levels in negotiations with uncertainty about future prospects. In this paper, we study the decommitment concept in the novel setting of large-scale logistics with multiple, competing companies. Orders for transportation of loads are acquired by agents of the (competing) companies by bidding in online auctions. We find significant increases in profit when the agents can decommit and postpone the transportation of a load to a more suitable time. Furthermore, we analyze the circumstances under which decommitment has a positive impact when agents are capable of handling multiple contracts simultaneously.
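The core decommitment decision can be sketched as a simple profit comparison. The names and the flat-penalty model below are assumptions for illustration, not the paper's mechanism: an agent drops its current contract when the new offer's profit, net of the decommitment penalty owed on the old contract, beats the profit of keeping it.

```python
from dataclasses import dataclass

@dataclass
class Contract:
    revenue: float   # payment for transporting the load
    cost: float      # expected transportation cost
    penalty: float   # fee owed to the counterparty on decommitment

    @property
    def profit(self) -> float:
        return self.revenue - self.cost

def should_decommit(current: Contract, offer: Contract) -> bool:
    # decommit only if the superior offer still pays after the penalty
    return offer.profit - current.penalty > current.profit

held = Contract(revenue=100.0, cost=70.0, penalty=15.0)
better = Contract(revenue=160.0, cost=90.0, penalty=20.0)
print(should_decommit(held, better))  # True: 70 - 15 = 55 > 30
```

In the auction setting of the paper, "offer" would be a newly won load whose time window fits the truck's schedule better; the penalty is what makes decommitment credible to the counterparty.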

    Synopsis of a film script

    No full text

    Repeated auctions with complementarities: Extended abstract

    No full text

    Action-reaction in multi-agent games

    No full text